Stabilizing Policy Improvement for Large-Scale Infinite-Horizon Dynamic Programming
Similar resources
Stabilizing Policy Improvement for Large-Scale Infinite-Horizon Dynamic Programming
Today’s focus on sustainability within industry presents a modeling challenge that may be dealt with using dynamic programming over an infinite time horizon. However, the curse of dimensionality often results in a large number of states in these models. These large-scale models require numerically stable solution methods. The best method for infinite-horizon dynamic programming depends on both ...
SYSTEMS OPTIMIZATION LABORATORY DEPARTMENT OF MANAGEMENT SCIENCE AND ENGINEERING STANFORD UNIVERSITY STANFORD, CALIFORNIA 94305-4026 Stabilizing Policy Improvement for Large-Scale Infinite-Horizon Dynamic Programming
Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the sponsors. Reproduction in whole or in part is permitted for any purposes of the United States Government. This document has been approved for public release and sale; its distribution is unlimited. Abstract. Today's focus on sustainabi...
Solving infinite horizon optimal control problems of nonlinear interconnected large-scale dynamic systems via a Haar wavelet collocation scheme
We consider an approximation scheme using Haar wavelets for solving a class of infinite horizon optimal control problems (OCP's) of nonlinear interconnected large-scale dynamic systems. A computational method based on Haar wavelets in the time-domain is proposed for solving the optimal control problem. Haar wavelets integral operational matrix and direct collocation method are utilized to find ...
Infinite-Horizon Policy-Gradient Estimation
Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward ...
Journal
Journal title: SIAM Journal on Matrix Analysis and Applications
Year: 2009
ISSN: 0895-4798,1095-7162
DOI: 10.1137/060653305